NumPy is the fundamental package for scientific computing with Python.
The NumPy array object is the common interface for working with typed arrays of data across a wide variety of scientific Python packages. NumPy also features a C-API, which enables interfacing existing Fortran/C/C++ libraries with Python and NumPy.
In [ ]:
# Convention for import to get shortened namespace
import numpy as np
In [ ]:
# Create a simple array from a list of integers
a = np.array([1, 2, 3])
a
In [ ]:
# See how many dimensions the array has
a.ndim
In [ ]:
# Print out the shape attribute
a.shape
In [ ]:
# Print out the data type attribute
a.dtype
In [ ]:
# This time use a nested list of floats
a = np.array([[1., 2., 3., 4., 5.]])
a
In [ ]:
# See how many dimensions the array has
a.ndim
In [ ]:
# Print out the shape attribute
a.shape
In [ ]:
# Print out the data type attribute
a.dtype
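Since the cells above are shown without output, here is a minimal sketch that collects the three attributes for both arrays side by side. The names `flat` and `nested` are introduced only for this illustration, and the integer dtype depends on the platform (commonly `int64`), so no specific integer type is assumed.

```python
import numpy as np

# Hypothetical names for the two arrays created above
flat = np.array([1, 2, 3])                  # built from a flat list of integers
nested = np.array([[1., 2., 3., 4., 5.]])   # built from a nested list of floats

for name, arr in [("flat", flat), ("nested", nested)]:
    # ndim counts the axes, shape gives the length of each axis,
    # and dtype is the single type shared by every element
    print(name, arr.ndim, arr.shape, arr.dtype)
```

The nested list produces a two-dimensional array with one row and five columns, even though it holds only one inner list.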
NumPy also provides helper functions for generating arrays of regularly spaced data, which saves typing.
arange(start, stop, step) creates a range of values in the interval [start, stop) with step spacing.
linspace(start, stop, num) creates a range of num evenly spaced values over the range [start, stop].
In [ ]:
a = np.arange(5)
print(a)
In [ ]:
a = np.arange(3, 11)
print(a)
In [ ]:
a = np.arange(1, 10, 2)
print(a)
In [ ]:
b = np.linspace(5, 15, 5)
print(b)
In [ ]:
b = np.linspace(2.5, 10.25, 11)
print(b)
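The difference between the half-open interval of arange and the closed interval of linspace is easiest to see with the same endpoints. A minimal sketch, with values chosen only for illustration:

```python
import numpy as np

# arange excludes the stop value: [start, stop)
a = np.arange(0, 10, 2)
print(a)   # values 0, 2, 4, 6, 8

# linspace includes the stop value by default: [start, stop]
b = np.linspace(0, 10, 6)
print(b)   # values 0, 2, 4, 6, 8, 10
```

With arange you choose the spacing and NumPy works out how many points fit; with linspace you choose the number of points and NumPy works out the spacing.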
In [ ]:
a = range(5, 10)
b = [3 + i * 1.5/4 for i in range(5)]
In [ ]:
result = []
for x, y in zip(a, b):
    result.append(x + y)
print(result)
That is very verbose and not very intuitive. Using NumPy this becomes:
In [ ]:
a = np.arange(5, 10)
b = np.linspace(3, 4.5, 5)
In [ ]:
a + b
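As a quick check that the two approaches agree, the sketch below rebuilds both versions and compares them. The names `a_list`, `b_list`, and `result` are used only for this illustration.

```python
import numpy as np

# Pure-Python version: pair up the values and add them one at a time
a_list = range(5, 10)
b_list = [3 + i * 1.5 / 4 for i in range(5)]
result = [x + y for x, y in zip(a_list, b_list)]

# NumPy version: a single vectorized expression
a = np.arange(5, 10)
b = np.linspace(3, 4.5, 5)

# Compare with a tolerance, since both sides hold floating-point values
print(np.allclose(result, a + b))
```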
The four basic arithmetic operations (+, -, *, /) work the same way: each performs an element-by-element calculation on the two arrays. The two arrays must have the same shape, or shapes that NumPy can broadcast together.
In [ ]:
a * b
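To illustrate the shape requirement, this sketch applies all four operators and then tries to add an array of a different length. The array `c` and the flag `mismatch_raised` are hypothetical names introduced only for this example.

```python
import numpy as np

a = np.arange(5, 10)
b = np.linspace(3, 4.5, 5)

# Each operator works element by element on arrays of the same shape
print(a + b)
print(a - b)
print(a * b)
print(a / b)

# Arrays of length 5 and length 3 cannot be matched up, so NumPy raises ValueError
c = np.arange(3)
mismatch_raised = False
try:
    a + c
except ValueError as err:
    mismatch_raised = True
    print("shape mismatch:", err)
```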
In [ ]:
np.pi
In [ ]:
np.e
In [ ]:
# This makes working with radians effortless!
t = np.arange(0, 2 * np.pi + np.pi / 4, np.pi / 4)
t
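Because arange with a non-integer step relies on floating-point arithmetic to decide how many points to produce, linspace can be a more predictable way to build this kind of array. A minimal sketch, assuming we want nine points covering 0 to 2π inclusive (the name `t_lin` is introduced only for this illustration):

```python
import numpy as np

# Nine evenly spaced angles from 0 to 2*pi, both endpoints included
t_lin = np.linspace(0, 2 * np.pi, 9)

# The number of points is exactly what we asked for
print(len(t_lin))

# Neighboring angles are pi/4 apart, up to floating-point rounding
print(np.allclose(np.diff(t_lin), np.pi / 4))
```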
NumPy also has math functions that operate on whole arrays. Like the arithmetic operators, they simplify and speed up these calculations. Be sure to check out the listing of mathematical functions in the NumPy documentation.
In [ ]:
# Calculate the sine function
sin_t = np.sin(t)
print(sin_t)
In [ ]:
# Round to three decimal places
print(np.round(sin_t, 3))
In [ ]:
# Calculate the cosine function
cos_t = np.cos(t)
print(cos_t)
In [ ]:
# Convert radians to degrees
degrees = np.rad2deg(t)
print(degrees)
In [ ]:
# Integrate the sine function with the trapezoidal rule
# (np.trapz is named np.trapezoid in NumPy 2.0 and later)
sine_integral = np.trapz(sin_t, t)
print(np.round(sine_integral, 3))
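To make the trapezoidal rule less of a black box, the sketch below computes the same kind of integral by hand from slices and `np.diff`. It is an illustration of the rule, not NumPy's internal implementation; it builds its own `t` with linspace so that it runs on its own, and `widths`, `heights`, and `manual_integral` are names chosen for this example.

```python
import numpy as np

# Angles from 0 to 2*pi and the sine evaluated at each of them
t = np.linspace(0, 2 * np.pi, 9)
sin_t = np.sin(t)

# Width of each interval and the average height of its two end points
widths = np.diff(t)
heights = (sin_t[:-1] + sin_t[1:]) / 2

# The trapezoidal rule sums the area of every trapezoid
manual_integral = np.sum(widths * heights)
print(np.round(manual_integral, 3))
```

Over a full period the positive and negative halves of the sine cancel, so the result is zero up to floating-point rounding.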
In [ ]:
# Sum the values of the cosine
cos_sum = np.sum(cos_t)
print(cos_sum)
In [ ]:
# Calculate the cumulative sum of the cosine
cos_csum = np.cumsum(cos_t)
print(cos_csum)
In [ ]:
# Create an array for testing
a = np.arange(12).reshape(3, 4)
In [ ]:
a
Indexing in Python is 0-based, so the command below looks for the 2nd item along the first dimension (row) and the 3rd along the second dimension (column).
In [ ]:
a[1, 2]
We can also index along only the first dimension, which returns an entire row:
In [ ]:
a[2]
Negative indices are also allowed, which permit indexing relative to the end of the array.
In [ ]:
a[0, -1]
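Since the cells above are shown without output, the following sketch repeats the three indexing forms with the values they return for this particular array. The last line is an extra illustration of combining two negative indices.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# a is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(a[1, 2])    # 2nd row, 3rd column -> 6
print(a[2])       # entire 3rd row -> values 8, 9, 10, 11
print(a[0, -1])   # 1st row, last column -> 3
print(a[-1, -1])  # last row, last column -> 11
```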
Slicing syntax is written as start:stop[:step], where all numbers are optional.
Note that stop represents one past the last item; one can also think of the slice as a half-open interval: [start, stop).
In [ ]:
# Get the 2nd and 3rd rows
a[1:3]
In [ ]:
# All rows and 3rd column
a[:, 2]
In [ ]:
# ... can be used to replace one or more full slices
a[..., 2]
In [ ]:
# Slice every other row
a[::2]
In [ ]:
# Slice out every other column
a[:, ::2]
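The slicing rules can be checked directly on the test array. This is a minimal sketch; the name `rows` is introduced only for the example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# stop is exclusive, so 1:3 selects rows 1 and 2 only
rows = a[1:3]
print(rows.shape)   # (2, 4)

# Omitted numbers default to the start, the end, and a step of 1
print(np.array_equal(a[:], a[0:3:1]))

# ... stands in for the leading full slice, so both select the same column
print(np.array_equal(a[:, 2], a[..., 2]))

# A step of 2 keeps columns 0 and 2
print(a[:, ::2].tolist())   # [[0, 2], [4, 6], [8, 10]]
```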
In [ ]:
# Slice every other item along each dimension -- how would we do this
Matplotlib is a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
The first step is to set up our notebook environment so that matplotlib plots appear inline as images:
In [ ]:
%matplotlib inline
Next we import the matplotlib library's pyplot interface. This is a MATLAB-like interface that makes generating plots relatively simple. To shorten this long name, we import it as plt to keep things short but clear.
In [ ]:
import matplotlib.pyplot as plt
Now we generate some data to use while experimenting with plotting:
In [ ]:
times = np.array([ 93., 96., 99., 102., 105., 108., 111., 114., 117.,
120., 123., 126., 129., 132., 135., 138., 141., 144.,
147., 150., 153., 156., 159., 162.])
temps = np.array([310.7, 308.0, 296.4, 289.5, 288.5, 287.1, 301.1, 308.3,
311.5, 305.1, 295.6, 292.4, 290.4, 289.1, 299.4, 307.9,
316.6, 293.9, 291.2, 289.8, 287.1, 285.8, 303.3, 310.])
Now we come to two quick lines to create a plot. Matplotlib has two core objects: the Figure and the Axes. The Axes is an individual plot with an x-axis, a y-axis, labels, etc.; it has all of the various plotting methods we use. A Figure holds one or more Axes on which we draw.
Below, the first line asks for a Figure 10 inches by 6 inches; matplotlib takes care of creating an Axes on it for us. After that, we call plot, with times as the data along the x-axis (the independent values) and temps as the data along the y-axis (the dependent values).
In [ ]:
# Create a figure and an axes
fig, ax = plt.subplots(figsize=(10, 6))
# Plot times as x-variable and temperatures as y-variable
ax.plot(times, temps)
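To illustrate that a Figure can hold more than one Axes, here is a minimal sketch with two side-by-side plots. The data `x` and the names `ax_left` and `ax_right` are hypothetical, and the Agg backend is selected only so the sketch runs outside a notebook without a display; inside the notebook those two lines can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data for illustration only
x = np.linspace(0, 2 * np.pi, 50)

# One Figure holding two Axes, arranged in 1 row and 2 columns
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Each Axes has its own plotting methods and its own title
ax_left.plot(x, np.sin(x))
ax_left.set_title('sin(x)')
ax_right.plot(x, np.cos(x))
ax_right.set_title('cos(x)')

# The Figure keeps track of the Axes drawn on it
print(len(fig.axes))
```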
From there, we can do things like ask the Axes to add labels for x and y:
In [ ]:
# Add some labels to the plot
ax.set_xlabel('Time')
ax.set_ylabel('Temperature')
# Prompt the notebook to re-display the figure after we modify it
fig
We can also add a title to the plot:
In [ ]:
ax.set_title('GFS Temperature Forecast', fontdict={'size':16})
fig
Of course, we can do so much more...
In [ ]:
# Set up more temperature data
temps_1000 = np.array([316.0, 316.3, 308.9, 304.0, 302.0, 300.8, 306.2, 309.8,
313.5, 313.3, 308.3, 304.9, 301.0, 299.2, 302.6, 309.0,
311.8, 304.7, 304.6, 301.8, 300.6, 299.9, 306.3, 311.3])
Here we call plot more than once to plot multiple series of temperature on the same plot; when plotting we pass label to plot to facilitate automatic creation of a legend, which is added with the legend call. We also add gridlines to the plot using the grid() call.
In [ ]:
fig, ax = plt.subplots(figsize=(10, 6))
# Plot two series of data
# The label argument is used when generating a legend.
ax.plot(times, temps, label='Temperature (surface)')
ax.plot(times, temps_1000, label='Temperature (1000 mb)')
# Add labels and title
ax.set_xlabel('Time')
ax.set_ylabel('Temperature')
ax.set_title('Temperature Forecast')
# Add gridlines
ax.grid(True)
# Add a legend to the upper left corner of the plot
ax.legend(loc='upper left')
We're not restricted to the default look of the plots; we can override style attributes such as linestyle and color. color accepts a wide array of options, such as red or blue or HTML color codes. Here we use a red taken from the Tableau color set in matplotlib, tab:red, for both lines, and set a dashed linestyle on the second so the two can be told apart.
In [ ]:
fig, ax = plt.subplots(figsize=(10, 6))
# Specify how our lines should look
ax.plot(times, temps, color='tab:red', label='Temperature (surface)')
ax.plot(times, temps_1000, color='tab:red', linestyle='--',
label='Temperature (isobaric level)')
# Same as above
ax.set_xlabel('Time')
ax.set_ylabel('Temperature')
ax.set_title('Temperature Forecast')
ax.grid(True)
ax.legend(loc='upper left')
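The different ways of naming a color can be compared with matplotlib's own color utilities. The sketch below uses `matplotlib.colors.to_hex` to normalize three specifications; the list of examples is chosen only for illustration.

```python
from matplotlib import colors

# A named color, a Tableau palette name, and an HTML hex code are all accepted
for spec in ['red', 'tab:red', '#d62728']:
    # to_hex converts any valid color specification to a hex string
    print(spec, '->', colors.to_hex(spec))
```

The Tableau red is a softer shade than the plain named red, which is why the two produce different hex codes.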